Abstract

Key predistribution is a well-known technique for ensuring secure communication via encryption among sensors 
deployed in an ad-hoc manner to form a sensor network. In this paper, we propose a novel 2-Phase technique for key 
predistribution based on a combination of inherited and random key assignments from the given key pool to individual 
sensor nodes. We also develop an analytical framework for measuring security-performance tradeoffs of different key 
distribution schemes by providing metrics for measuring sensornet connectivity and resiliency to enemy attacks. In 
particular, we show analytically that the 2-Phase scheme provides better average connectivity and superior q-composite 
connectivity than the random scheme. We then prove that the invulnerability of a communication link under an arbitrary 
number of node captures by an adversary is higher under the 2-Phase scheme. The probability of a communicating node 
pair having an exclusive key also scales better with network size under the 2-Phase scheme. We also show analytically 
that the vulnerability of an arbitrary communication link in the sensornet to single node capture is lower under 2-Phase, 
assuming both network-wide and localized capture. Simulation results also show that the number of exclusive 
keys shared between any two nodes is higher, while the number of q-composite links compromised when a given number 
of nodes are captured by the enemy is smaller, under the 2-Phase scheme as compared to the random one. 

Keywords: Sensor Networks, Key Distribution, Link Vulnerability. 

I. Introduction 

Sensor networks are autonomous systems of tiny sensor nodes equipped with integrated sensing and data processing 
capabilities. They can be deployed on a large scale in resource-limited and harsh environments such as seismic zones, 
ecological contamination sites or battlefields [1], [8]. Their ability to acquire spatio-temporally dense data in hazardous 
and unstructured environments makes them attractive for a wide variety of applications [4], [17], [21]. Sensor networks 
(sensornets) are distinguished from typical ad-hoc wireless networks by their stringent resource constraints and larger 
scale. These operational constraints impose severe security challenges since sensornets may be deployed in hostile 
environments where nodes are subject to capture and communication links are subject to monitoring [13], [18], [23]. 

Nodes in a sensornet are typically deployed in an ad-hoc manner into arbitrary topologies before self-organizing into 
a multihop network for collecting data from the environment and forwarding it to the base station or sink [1], [5]. Establishing 
a secure communication infrastructure among a collection of arbitrarily deployed sensor nodes is an important 
and challenging security issue (known as the bootstrapping problem [2]). Due to severe computational and memory 
constraints, symmetric key cryptography is the most feasible encryption mechanism for node-to-node communication. 
However, the high energy cost of routing makes traditional methods of key exchange and key distribution protocols 
based on trusted third-party mechanisms difficult to implement. 

Since bootstrapping should not rely on pre-existing trust associations between fixed sensor nodes or the availability 
of an on-line service to establish these trust associations, an attractive alternative for secure encrypted communication 
between adjacent sensor nodes is key predistribution, i.e. pre-installing a limited number of keys in sensor nodes prior 
to actual deployment. Key predistribution is also challenging since ad-hoc network deployment makes it impossible 
to pre-determine the neighborhood of any node, yet key distribution schemes must ensure good network connectivity 
(through key sharing) and resilience to node/key capture by the enemy even with a limited number of keys per node. 
A trivial predistribution solution is to have a single secret key shared among all nodes. While this solution keeps 
the network fully connected (every node can communicate with every other node) and scalable (new nodes can be 
added without any keying overhead), it provides extremely poor resiliency to enemy attack. At the opposite end of the 
spectrum, one can have each pair of nodes sharing a distinct key. This solution provides both high connectivity and 
high security but is very memory-intensive and not scalable. 

There have been several recent works on key pre-distribution [6], [2], [14], [3], [11]. The pioneering paper [6] 
proposes a simple, scalable probabilistic key predistribution scheme in which a certain number of keys are drawn at 
random from a (large) key pool and distributed to sensor nodes prior to their deployment. Post-deployment, adjacent 
nodes participate in shared key discovery. A logical graph is created in which edges exist between adjacent sensor 
nodes sharing at least one key. This is followed by the establishment of paths between nodes using secure links in 
the logical graph. In [2], the authors have presented new mechanisms for key establishment using the random key 
pre-distribution scheme of [6] as a basis. Their q-composite scheme requires that two adjacent communicating nodes 
have at least q keys in common. This scheme provides high resiliency against small-scale enemy attack. 

Note that due to the random distribution of keys and ad-hoc deployment of sensors, there is a non-negligible probability 
of a disconnected logical graph. The degree of connectivity of the resultant sensor network under a given key 
predistribution scheme is therefore an important performance metric. There is also a strong correlation between net- 
work connectivity and security. Adversaries that capture nodes can gain complete information about the keys stored at 
the node in the worst case. Thus in order to make the network less vulnerable to node/key capture the overall key pool 
size must be large. Since individual sensor nodes have limited memory for key storage, this reduces the probability of 
having a large number of shared keys between neighboring sensors. 

Good solutions for key pre-distribution must be memory-efficient and scalable, while simultaneously ensuring that (a 
majority of) the network is connected through secure communication links and providing high resiliency to enemy 
attack, so that the capture of a few sensor nodes does not (severely) compromise network communication. In this paper, 
we propose a novel solution to the key predistribution problem (labeled 2-Phase key predistribution) that exploits the 
connectivity and capture-resiliency properties of loading sensor nodes with a combination of randomly derived and 
inherited keys. We evaluate our solution by analytically developing novel quantitative metrics that measure the key 
predistribution schemes' security-performance tradeoffs in terms of the network resiliency to node/key capture, the 
number of available secure links and the key (memory) requirement per node for a given level of connectivity. We 
compare the network connectivity and security performance and show analytically and through simulations that the 
proposed 2-Phase scheme strongly favors highly secure large-composite key communication and is more resilient to 
node capture than the random scheme. We first show analytically that the invulnerability of an arbitrary q-composite 
communication link to any number of node captures is higher in our scheme. We also derive analytical results for 
measuring the vulnerability of a q-composite link to single-node capture, assuming adversaries who can use captured-key 
knowledge network-wide as well as locally, and show that the 2-Phase scheme is more resilient. Finally, we present 
simulation results that show the number of exclusive keys shared between two nodes is higher while the number of 
q-composite links compromised when a given number of nodes are captured by the enemy is smaller under the 2-Phase 
scheme. 

The paper is organized as follows: Section 2 describes some of the related work on key predistribution. Section 3 
describes the proposed 2-Phase scheme. Section 4 outlines our metrics for measuring security-performance tradeoffs. 
Section 5 contains some analytical connectivity results with Section 6 containing analytical results on security of 
communication edges under node capture. Section 7 describes simulation results followed by some implementation 
issues and Conclusions in Section 9. 

II. Related Work: Overview of the Basic Random Key Distribution Scheme 

In general, key management for sensor networks consists of three phases: key pre-distribution, shared key discovery 
and path establishment. There have been several recent works on key pre-distribution [6], [2], [14], [10], [11]. The 
pioneering paper [6] proposes a simple probabilistic key pre-distribution scheme which works as follows: A pool 
of L keys with key identifiers is generated. For each node, k ≪ L keys are drawn at random from the pool and are 
installed into the memory of the node. The shared key discovery phase takes place after the deployment of sensor 
nodes. Each node broadcasts its key identifiers. After discovering a shared key with a neighboring node, a node 
can verify that the neighbor actually holds the key through a challenge-response protocol. A link is then established 
using the shared key. A (logical) graph with secured communication links is formed after this phase. Due to the random 
distribution of keys and ad-hoc deployment of sensors, there exists a chance that the logical graph is disconnected, in 
which case sensor nodes perform range extension [2] until a connected logical graph is created. However, in terms of 
node energy, this procedure is quite expensive for a sensornet. 
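
The pre-distribution and shared key discovery steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of integer key identifiers, and the assumption that every pair of nodes is within radio range are choices made here, not part of the scheme in [6], and the challenge-response verification is omitted.

```python
import random
from itertools import combinations

def random_predistribution(num_nodes, pool_size, ring_size, seed=0):
    """Give every node ring_size distinct key identifiers drawn from range(pool_size)."""
    rng = random.Random(seed)
    return [set(rng.sample(range(pool_size), ring_size)) for _ in range(num_nodes)]

def logical_graph(key_rings, radio_neighbors):
    """Keep a radio link (u, v) only if the two key rings share at least one key."""
    secure = {}
    for u, v in radio_neighbors:
        shared = key_rings[u] & key_rings[v]   # result of comparing broadcast identifiers
        if shared:
            secure[(u, v)] = shared
    return secure

rings = random_predistribution(num_nodes=20, pool_size=200, ring_size=30)
radio = list(combinations(range(20), 2))   # assumption: every pair is in radio range
links = logical_graph(rings, radio)
print(len(links), "of", len(radio), "radio links are secured by a shared key")
```

Radio links missing from `links` are exactly the ones that would force range extension or multi-hop path establishment.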

In [2], the authors have presented new mechanisms for key establishment using the random key pre-distribution 
scheme of [6] as a basis. Their q-composite scheme requires that two adjacent communicating nodes must have at 
least q keys in common. This scheme provides high resiliency against small-scale enemy attack. 

In [14], the authors evaluate the random key predistribution scheme under a variety of non-random node deployment 
probability distributions. However, for many realistic sensor networks as visualized in Smart Dust [19], node 
deployment into known topologies is an unrealistic assumption. [3] presents an alternative model for secure sensor 
communication using polynomials rather than keys, but the computational constraint of using such polynomials is not 
extensively evaluated. 

For sensornets with significant memory constraints, [12] proposes a deterministic subvector-based key predistribution 
approach based on distributed agreement using vector spaces and quorums [15]. The scheme has the deterministic 
property that any two nodes sharing a key under a given mapping (out of several possible mappings) from sensor nodes 
to keys share exactly two keys (that are unique to these two nodes) in this mapping. For a given network deployment, 
two physically adjacent sensors can encrypt messages using shared keys under one or more of these mappings, where 
each mapping yields a 2-composite key. It can be shown that any ad-hoc deployment of sensor nodes yields very high 
connectivity with low key storage requirements per node. 

III. Two-Phase Key Predistribution Mechanism 

We now describe a novel key pre-distribution scheme (labeled 2-Phase) in which sensor nodes are preloaded with 
a combination of randomly derived and inherited keys. Our proposed 2-Phase scheme is motivated by the following 
key observations: 

• From the connectivity point of view, the probability of having a common key between two nodes decreases as the 
key pool size increases under the random pre-distribution scheme. We observe that the (probabilistic) connectivity 
of the logical graph can be increased if we can ensure that each node deterministically shares some of its keys 
with some nodes (as in the subvector scheme [12]). 

• We hypothesize that it is better from the security point of view to pre-distribute keys in a less-random fashion such 
that whenever a node shares a key with another node, it should be likely to share a larger number of keys with 
this node. If so, the resulting network should consist of high-composite links. Note that q-composite schemes are 
more secure with increasing q. If the adversary has obtained X keys (through the capture of one or more sensor 
nodes), the probability of determining the exact q-subset of X that is used by a given communicating sensor pair 
decreases exponentially with increasing q. 
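
To make the last observation concrete, the following minimal sketch assumes (this is an assumption made here, not a statement from the analysis) that the adversary holds X captured keys containing all q link keys and that every q-subset of X is equally likely to be the one in use. The function name is illustrative.

```python
from math import comb

def guess_probability(num_captured_keys, q):
    """Chance that one uniform guess among all q-subsets of the captured keys is correct."""
    return 1 / comb(num_captured_keys, q)

for q in (1, 2, 3, 4):
    print(q, guess_probability(50, q))
```

With X = 50 the number of candidate subsets grows from 50 for q = 1 to 230300 for q = 4, so each additional required key cuts the success probability of a single guess sharply.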

We now describe the key steps in the proposed 2-Phase key predistribution mechanism. Order the sensor nodes 
a priori in a logical queue and distribute keys in increasing order according to the rules below. 

• The first node is assigned k keys drawn randomly from the key pool of size L. 

• For every succeeding sensor node i, k keys are distributed in two consecutive phases. First, node i receives a 
predetermined fraction f (1/k < f < 1) of its k keys drawn randomly from the key space of node i − 1. The 
remaining (1 − f) fraction of k keys are then drawn randomly from the key pool of size L − k, after excluding 
all k keys of node i − 1 from L. 
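
The two rules above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name, the integer key identifiers, and the rounding of fk to the nearest integer are assumptions made for the example.

```python
import random

def two_phase_predistribution(num_nodes, pool_size, ring_size, f, seed=0):
    """Assign ring_size keys per node: f*ring_size inherited, the rest drawn at random."""
    rng = random.Random(seed)
    pool = range(pool_size)
    inherited_count = round(f * ring_size)        # fk keys come from the predecessor
    rings = [set(rng.sample(pool, ring_size))]    # first node: k random keys from the pool
    for _ in range(1, num_nodes):
        prev = rings[-1]
        # Phase 1: inherit fk keys from the key space of the previous node.
        inherited = set(rng.sample(sorted(prev), inherited_count))
        # Phase 2: draw the remaining keys from the pool minus ALL keys of the previous node.
        rest_pool = [key for key in pool if key not in prev]
        fresh = set(rng.sample(rest_pool, ring_size - inherited_count))
        rings.append(inherited | fresh)
    return rings

rings = two_phase_predistribution(num_nodes=50, pool_size=1000, ring_size=40, f=0.25)
print(len(rings[0] & rings[1]), len(rings[0] & rings[2]))
```

Because the second phase excludes every key of the predecessor, consecutive nodes in the logical queue share exactly fk keys, while nodes further apart share a random number of keys.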

The 2-Phase scheme is designed to be biased in favor of nodes sharing several keys with their immediate predecessors 
and successors, through direct inheritance as well as a random component. Intuitively, this key predistribution 
methodology should offer better secure connectivity in the logical graph by inducing the sharing of a larger 
number of keys between nodes, thereby enabling q-composite communication for larger values of q. More surprisingly, 
however, as we show in the security analysis section, this methodology also provides enhanced security under node 
capture/eavesdropping by allowing for more 'exclusive' key sharing between communicating nodes. The fraction f 
(called the inheritance ratio) plays a significant part in the connectivity/security of the logical graph created after node 
deployment. Note that the random key predistribution scheme is not a special case of the 2-Phase scheme with f = 0, 
since we eliminate all k keys of the previous node from L regardless of the value of f. We will shortly derive relationships 
between 'good' values of the various parameters k, L, f etc. Finally, the proposed 2-Phase scheme is scalable 
since new sensor nodes can be assigned keys according to this rule at any time. 

Note that there is an implicit ordering of sensors based on their position in the logical queue which determines 
each node's key set. Thus each node has a logical identifier which we will refer to as its LID. Storing a node's 
LID in memory is an implementation decision as there is an associated security-performance tradeoff. If LIDs are 
stored, nodes can be restricted to forming communication links only with adjacent nodes whose LIDs are greater than a 
specified minimum and within a specified maximum LID distance. As shown later, this will encourage the formation of 
high-composite encrypted communication links that are also less vulnerable to compromise in the case of node capture. 
Conversely, storing LIDs will enable the adversary to target nodes with specific LIDs (although their positions will 
still be unknown). Therefore this becomes an implementation issue. 

IV. Metrics for Measuring Security-Performance Tradeoffs 

Since security mechanisms directly impact system performance, there is a strong need to develop a rigorous analytical 
framework for measuring the security-performance tradeoffs of arbitrary key distribution schemes. These tradeoffs 
can be represented as functions of individual metrics which measure the network's 'secure' connectivity in terms of the 
number of available secure links or paths, the memory requirement in terms of keys per node for a given level of connectivity, 
and the resiliency of the network to node/key capture. In this paper, we obtain some new analytical 
results on the security-performance tradeoffs of key predistribution schemes using the quantitative metrics outlined 
below. Results for the proposed 2-Phase scheme are compared with random key predistribution. 

• Connectivity Metrics 

- Logical sensor degree: We measure the logical degree of a node as the number of adjacent sensor nodes (in 
the logical graph) with which it shares at least one key. The higher the expected node degree, the better the 
connectivity of the logical graph. A high expected degree also implies a larger expected number of disjoint 
paths from any source to any destination. Multiple disjoint paths can be used to split communication and 
carry disjoint messages, thereby increasing overall data security. We show that nodes under the proposed 
2-Phase scheme have higher expected degrees as compared to random key predistribution. 

- Number of keys shared between any two neighboring nodes: This metric can be used to evaluate connectivity 
under q-composite key communication. We show that any two sensor nodes are expected to share more keys 
and are more likely to share q keys for any value of q (thereby enabling q-composite communication), as 
compared to random key predistribution. 

• Security Metrics 

- Exclusive Key Sharing: If two communicating nodes share one or more keys exclusively, then their communication 
is invulnerable to any number of node captures. Note that the exclusivity metric can be computed 
network-wide or with respect to a local cluster¹. Network-wide exclusivity between communicating nodes 
implies resilience against a powerful adversary who can capture nodes and use the captured key information 
anywhere in the sensor network. Alternatively, we can consider a weaker adversary who can use the key 
information only within the cluster of the captured node. 

- Node Capture: We measure the impact of node capture on network security by considering the number of 
communication links that are no longer secure (i.e., only use keys from the captured key pool). We analytically 
determine bounds on the inheritance ratio f for which the 2-Phase scheme shows good resilience to 
network-wide as well as localized single-node capture and present simulation results that show good network 
resilience to multiple-node capture as well. The expected number of links compromised in these cases 
is shown to be lower for the 2-Phase scheme as compared to the random scheme. 
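
The security metrics above can be evaluated on any concrete key assignment. The sketch below is illustrative only: key rings are modelled as Python sets, the helper names are hypothetical, and the tiny four-node assignment is made up for the example. It counts network-wide exclusive keys and the links compromised by a set of captured nodes.

```python
from collections import Counter

def shared_keys(rings, u, v):
    return rings[u] & rings[v]

def exclusive_keys(rings, u, v):
    """Shared keys of (u, v) that no other node in the network holds."""
    holders = Counter(key for ring in rings for key in ring)
    return {key for key in shared_keys(rings, u, v) if holders[key] == 2}

def compromised_links(rings, links, captured):
    """Links between uncaptured nodes whose shared keys all lie in captured key rings.

    Assumes every link in `links` has at least one shared key.
    """
    captured_keys = set().union(*(rings[c] for c in captured))
    return [(u, v) for u, v in links
            if u not in captured and v not in captured
            and shared_keys(rings, u, v) <= captured_keys]

rings = [{1, 2, 3}, {2, 3, 4}, {2, 3, 9}, {4, 7, 8}]   # made-up key rings
links = [(0, 1), (0, 2), (1, 3)]
print(exclusive_keys(rings, 1, 3))                 # key 4 is held only by nodes 1 and 3
print(compromised_links(rings, links, captured=[2]))
```

In this toy network, capturing node 2 compromises link (0, 1), whose keys 2 and 3 are both held by node 2, whereas link (1, 3) survives any capture of other nodes because key 4 is exclusive to its endpoints.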

V. Secure Network Connectivity: Analytical Results 

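Proposition 1: Let l and i > l be any two nodes in the sensornet. The expected number of keys shared by l and i 
under the 2-Phase and Random schemes, respectively, are 

$$E^{2P} = \frac{k^2}{L} + \left(\frac{fL-k}{L-k}\right)^{i-l}\left(k - \frac{k^2}{L}\right), \qquad E^{Rand} = \frac{k^2}{L}.$$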

Proof: The number of common keys between any two nodes under the random scheme follows the standard hypergeometric 
distribution with parameters k and L, whose mean is k^2/L. For the 2-Phase scheme, let X_r be the expected number of 
keys in common between nodes l and l + r. Then we have, 

$$X_{r+1} = f X_r + (k - X_r)\,\frac{k - fk}{L - k},$$

since after selecting an expected f X_r keys from the previous node, there are k − X_r keys of node l left in the random 
keypool of the current node. E^{2P} = X_{i−l} is the solution to the above recurrence relation with initial condition X_0 = k. 
E^{2P} > E^{Rand} as expected. ■ 

¹ Typical sensor networks are organized into hierarchical clusters with cluster heads, such that each node is within wireless range of other nodes 
in the cluster [7]. Thus a compromised node can potentially eavesdrop on all intra-cluster communication. 

Thus to ensure q-composite connectivity between arbitrary nodes, a good choice is to select k and L such that 
q = k^2/L. Further, if f = k/L, then the expected number of common keys between any two nodes is identical under 
both schemes. 

Corollary 1: The probability that any two nodes share at least q keys and the expected q-composite degree of 
a sensor node (i.e., the number of neighbors with which it shares at least q keys) are higher under the 2-Phase key 
distribution scheme for f > k/L. 

As nodes are more likely to share multiple keys under 2-Phase, the probability of uncovering all such common 
keys (which is necessary to decipher data transmissions between the two nodes) can be shown to be lower and hence 
two-phase is more secure in this respect. 

VI. Network Resiliency against Enemy Attack: Analytical Results 

In this section, we propose some quantitative metrics for measuring the security of communication links under 
enemy attack and analytically evaluate these metrics under different adversarial models. We assume an adversary 
that is able to capture nodes and obtain full knowledge of the captured node's key space. We evaluate link security 
under a 'network-wide' adversary who can use knowledge of captured keys to compromise communication in any 
part of the network (regardless of the physical location of the captured node). Our results can be easily extended to 
analyze link vulnerability in the presence of a localized adversary who utilizes captured key knowledge locally, i.e. can 
compromise communication within a small neighborhood of the captured node (for example, its cluster as in LEACH 
[7]). 

A. Vulnerability Under Multiple Node Capture: Key Exclusivity 

We first evaluate the vulnerability of logical communication links in the sensornet to multiple node capture. An ob- 
vious metric for measuring this vulnerability is the degree of exclusivity of the keys used by any two neighboring nodes 
for setting up a communication link. Therefore we evaluate the probability of any two neighboring nodes containing 
exactly one network-wide exclusive key, the presence of which will render their communication link invulnerable to 
any number of (other) node captures². 

Proposition 2: Key Exclusivity: In an N node sensor network, the probability that a given communication link 
between two arbitrary neighboring sensors is invulnerable to any number of network-wide node captures is given by: 

2-Phase: 
Random 

² In general, we can compute the probability of two nodes containing at least one exclusive key, but for all practical purposes this probability drops 
off extremely rapidly for more than one exclusive key. Hence we obtain a simple lower bound on invulnerability by focusing on the presence of a 
single exclusive key. 

Link invulnerability is higher under the 2-Phase scheme for i < f < 

Proof: Let IV^{rand} and IV^{2P} denote the probability that two arbitrary neighboring sensor nodes i and j communicate 
using an exclusive key under the two key predistribution schemes. In the case of the 2-Phase scheme, i and 
j represent the LIDs of the communicating nodes. Consider a specific key a from the key pool. 

For the random scheme, the probability that both nodes i and j possess key a is (k/L)^2 while the probability that 
an arbitrary node l ∉ {i, j} does not possess key a is $1 - \binom{L-1}{k-1}/\binom{L}{k} = 1 - k/L$. Hence the invulnerability of the 
link between nodes i and j under any number of node captures is given by: 

$$IV^{rand} = \left(\frac{k}{L}\right)^{2}\left(1 - \frac{k}{L}\right)^{N-2}$$

For the 2-Phase scheme, the probability that key a is exclusive to nodes i and j is the probability that node 1 does not 
select key a, followed by all nodes up to node i — 1 not selecting key a conditioned on the fact that their predecessor 
node did not select key a. Node i then selects key a given that node i — 1 did not select it. Similarly all nodes after i 
conditionally do not select key a except node j. 

Let P(1^c) denote the probability that node 1 does not contain key a; P(1^c) = 1 − k/L since node 1 selects keys 
from the keypool first. Similarly, let P(l^c | (l − 1)) denote the probability that a node l does not contain key a given 
that node l − 1 contains it; P(l^c | (l − 1)) = 1 − f, by definition. Finally, we have 

$$P(l^c \mid (l-1)^c) = 1 - \frac{k-fk}{L-k}, \qquad P(l \mid (l-1)^c) = \frac{k-fk}{L-k},$$

since node l draws its k − fk random keys from the L − k keys not held by node l − 1. 

We now consider two cases (WLOG assume j > i). 
Case 1: j > i + 1: 
$$IV^{2P} = P(1^c)\,P(2^c \mid 1^c) \cdots P((i-1)^c \mid (i-2)^c)\,P(i \mid (i-1)^c)\,P((i+1)^c \mid i) \cdots P(j \mid (j-1)^c)\,P((j+1)^c \mid j) \cdots P(N^c \mid (N-1)^c)$$
Case 2: j = i + 1: 
$$IV^{2P} = P(1^c)\,P(2^c \mid 1^c) \cdots P((i-1)^c \mid (i-2)^c)\,P(i \mid (i-1)^c)\,P(j \mid i)\,P((j+1)^c \mid j) \cdots P(N^c \mid (N-1)^c)$$
Comparing the expressions for the two schemes, the probability of two nodes having a network-wide exclusive key (i.e. link 
invulnerability) is higher under the 2-Phase scheme as compared to the random scheme for the range of f given in the statement of the proposition. ■ 
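
The chain of conditional probabilities used in this proof can be evaluated numerically. The sketch below is a hedged illustration: it assumes fk is an integer, that a node holds key a with probability f when its predecessor holds it and with probability (k − fk)/(L − k) otherwise (which follows from the two phases of the scheme), and that the per-key exclusivity probability under the random scheme is (k/L)^2 (1 − k/L)^(N−2). The function names are hypothetical.

```python
def exclusive_key_prob_random(k, L, N):
    """Probability that a specific key is held by two given nodes and by no other node."""
    return (k / L) ** 2 * (1 - k / L) ** (N - 2)

def exclusive_key_prob_two_phase(k, L, f, N, i, j):
    """Same probability under 2-Phase for nodes with LIDs i < j (LIDs run from 1 to N)."""
    p_if_prev_has = f                        # key inherited from the predecessor
    p_if_prev_lacks = (k - f * k) / (L - k)  # key drawn in the random phase
    prob = 1.0
    prev_has = None
    for node in range(1, N + 1):
        wanted = node in (i, j)              # only i and j may hold the key
        if node == 1:
            p_has = k / L
        else:
            p_has = p_if_prev_has if prev_has else p_if_prev_lacks
        prob *= p_has if wanted else 1 - p_has
        prev_has = wanted
    return prob

k, L, f, N = 40, 1000, 0.25, 10
print(exclusive_key_prob_random(k, L, N))
print(exclusive_key_prob_two_phase(k, L, f, N, i=3, j=6))   # Case 1: j > i + 1
print(exclusive_key_prob_two_phase(k, L, f, N, i=3, j=4))   # Case 2: j = i + 1
```

The two calls at the end correspond to the two cases of the proof; which scheme gives the larger value depends on f, k/L and N.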

We can consider an alternative version of the 2-Phase scheme that provides much greater key exclusivity. The first 
step of key selection is the same as before, i.e. node i selects fk keys from the key space of node i − 1. However, 
in the second step, only the fk keys selected from node i − 1 are excluded from the key pool L before node i selects its 
remaining k − fk keys. For this modified 2-Phase scheme, called 2PWR (2-Phase with replacement), we can show the 
following: 

Proposition 3: Scalable Comparative Exclusivity: The invulnerability of a communication edge under any number 
of node captures when keys are distributed using the 2PWR scheme is 

rIPWR _ j^rRand 



(l~f k 



Thus link invulnerability under 2PWR outperforms the random scheme as the size of the sensornet N scales upward. 
This link invulnerability is maximized when 

IN -l)£-4 

Proof: Using the same technique as in Proposition 2, the probability of a given communication link containing 
a network-wide exclusive key under the 2PWR scheme is given by: 



2 / , \ N-2 



-2i'\\ i: _ (1 /) ( k\ ft 

(l-/r 



(i-/) 4 



jyRand ✓ « 



The value of f that maximizes the above term can then be found using elementary calculus. ■ 

While key exclusivity (and average network connectivity) are superior under the 2PWR scheme, the vulnerability 
of a link to single node capture is lower under the standard 2-Phase scheme as shown in the next section and hence 
we focus on that scheme for the rest of the paper. The choice of a particular key distribution scheme with its associated 
security-performance tradeoffs then becomes an implementation issue. 
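
For comparison with the standard scheme, the 2PWR variant described above differs only in which keys are removed from the pool before the random phase. A minimal sketch, with an illustrative function name and fk rounded to the nearest integer (both assumptions made here):

```python
import random

def two_phase_with_replacement(num_nodes, pool_size, ring_size, f, seed=0):
    """2PWR sketch: only the inherited keys are excluded from the pool in the random phase."""
    rng = random.Random(seed)
    inherited_count = round(f * ring_size)
    rings = [set(rng.sample(range(pool_size), ring_size))]
    for _ in range(1, num_nodes):
        inherited = set(rng.sample(sorted(rings[-1]), inherited_count))
        # The predecessor's non-inherited keys stay in the pool and may be drawn again.
        rest_pool = [key for key in range(pool_size) if key not in inherited]
        fresh = set(rng.sample(rest_pool, ring_size - inherited_count))
        rings.append(inherited | fresh)
    return rings

rings = two_phase_with_replacement(num_nodes=50, pool_size=1000, ring_size=40, f=0.25)
print(min(len(rings[n] & rings[n + 1]) for n in range(49)))
```

Under this variant consecutive nodes share at least fk keys rather than exactly fk, since the random phase can pick up further keys of the predecessor.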

The next section contains some analytical results on edge vulnerability under localized node capture when the 
average node density of the sensornet (i.e., the size of a network cluster) is M. Simulation results on the number of 
exclusive keys per communicating node pair in a cluster are presented in Section 7. 

B. Link Vulnerability Under Single Node Capture 

We now consider the vulnerability of communication links in the sensornet to the capture of a single node by the 
adversary. We assume that the adversary does not possess any extra knowledge about the network topology and thus 
the capture of any given node by the adversary is equally likely. 

Let i and j be any two communicating sensors in radio range and suppose the adversary captures node l. The 
vulnerability of edge (i, j) is the expectation (over all network nodes) that node l contains all the keys in common 
between i and j. We can thus define a network-wide vulnerability metric VC for arbitrary edges in the sensornet as 
follows: 

$$VC = \sum_{l} P[\text{node } l \text{ is captured}] \cdot P[l \text{ contains all keys used to communicate over } (i, j)] \qquad (5)$$

Before describing our vulnerability results, we first prove the following useful lemmas. 

Lemma 1: Let i, i − x, i + x be arbitrary nodes in a sensornet in which keys have been predistributed according to 
the 2-Phase scheme. Let Z be any subset of keys from the keyset of node i, |Z| ≤ k. The probability distributions of 
the number of keys from Z that appear in nodes i − x and i + x are identical and depend only on the LID difference 
x for both 2PWR as well as 2POR. 

Proof: Clearly, the total number of common keys between nodes i − 1 and i and between nodes i and i + 1 
follows the same probability distribution, since they are obtained in an identical manner through inheritance followed 
by keys from the random pool. Thus the number of common keys from any subset Z of i's keys also follows the same 
distribution in i − 1 and i + 1. The lemma follows by induction on i. ■ 

The following statements follow directly from Lemma 1 since the number of keys in common between any two 
nodes under 2-Phase depends only on their LID difference. 

Corollary 2: Let i and j be two arbitrary nodes in the sensornet in which keys are predistributed according to the 
2-Phase scheme, j > i. Consider nodes i − t and j + t, t ≥ 1. 

• The numbers of keys in common between the node pairs (i − t, i) and (j, j + t) follow identical probability distributions. 

• Suppose nodes i − t, i (j, j + t, resp.) share exactly β keys, 0 ≤ β ≤ k. Then the number of keys from the 
remaining keyset of i (j, resp.) present in node j (i, resp.) follow identical probability distributions. 

The above statements also hold for nodes i, i + y and j − y, j, where y ≤ ⌈(j − i)/2⌉. 

Lemma 2: Let l, i and j > i be any three nodes in a sensornet, such that i and l share exactly β keys, 0 ≤ β ≤ k 
and 1 ≤ l ≤ i + ⌈(j − i)/2⌉. Let Z denote the set of remaining keys in node i, |Z| = k − β. E_Z, the expected number 
of keys from Z that are present in j, is given by 

$$E_Z = \begin{cases} (k-\beta)\left[\dfrac{k}{L} + B^{\,j-i}\left(1-\dfrac{k}{L}\right)\right] & \text{if } l < i, \\[2ex] (k-\beta)\,\dfrac{k}{L}\left(1 - B^{\,j-l}\right) & \text{if } i < l \le i + \lceil (j-i)/2 \rceil, \end{cases} \qquad B = \frac{fL-k}{L-k}.$$

Proof: Let X_r represent the expected number of keys from keyset Z in node i + r (if l < i) and l + r (if 
i < l < j). E_Z is obtained by solving the recurrence relation 

$$X_r = f X_{r-1} + (k - \beta - X_{r-1})\,\frac{k-fk}{L-k} = X_{r-1}\left(\frac{fL-k}{L-k}\right) + (k-\beta)\,\frac{k-fk}{L-k}$$

with initial condition X_0 = k − β if l < i and X_0 = 0 if i < l ≤ i + ⌈(j − i)/2⌉. ■ 
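
The recurrence in the proof of Lemma 2 is linear with constant coefficients, so its solution can be written down directly. The short derivation below is a sketch under the definitions used in the proof; the symbol x* for the fixed point is introduced here only for illustration.

```latex
\begin{align*}
X_r &= B\,X_{r-1} + (k-\beta)\,\frac{k-fk}{L-k}, \qquad B = \frac{fL-k}{L-k},\\
x^{*} &= \frac{(k-\beta)(k-fk)}{(L-k)(1-B)} = \frac{(k-\beta)\,k}{L}
      \quad\text{since } 1-B = \frac{L(1-f)}{L-k},\\
X_r - x^{*} &= B\,(X_{r-1}-x^{*}) \;\Longrightarrow\; X_r = x^{*} + B^{r}\,(X_0 - x^{*}),\\
X_0 = k-\beta &:\quad X_r = (k-\beta)\Bigl[\tfrac{k}{L} + B^{r}\bigl(1-\tfrac{k}{L}\bigr)\Bigr],\\
X_0 = 0 &:\quad X_r = (k-\beta)\,\tfrac{k}{L}\,\bigl(1-B^{r}\bigr).
\end{align*}
```

In both cases X_r tends to (k − β)k/L as r grows, which is the expected number of keys from Z that a node would hold under purely random selection.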

Assume that node l is captured and let CR_l be the Bernoulli random variable indicating whether capture of l reveals 
all common keys between communicating nodes i and j. Denote PCR_l = Pr[CR_l = 1]. We now state our first 
proposition on sensornet vulnerability under single node capture. 

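Proposition 4: The probability of a given communication link between neighboring sensors i and j being compromised 
by the capture of an arbitrary node l ≠ i, j is given by 

$$PCR^{2P}_{i-t} = PCR^{2P}_{j+t} \le P^{2P}_{k} + \sum_{\beta=0}^{k-1} P^{2P}_{\beta}\,\frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}}\left(1 - f\left(\frac{k}{L} + B^{\,j-i}\left(1 - \frac{k}{L}\right)\right)\right)^{k-\beta} \quad \text{for } t \ge 1,$$

$$PCR^{2P}_{i+t} = PCR^{2P}_{j-t} \le P^{2P}_{k} + \sum_{\beta=0}^{k-1} P^{2P}_{\beta}\,\frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}}\left(1 - f\,\frac{k}{L}\left(1 - B^{\,j-i-t}\right)\right)^{k-\beta} \quad \text{for } 1 \le t \le \left\lceil \frac{j-i}{2} \right\rceil,$$

$$PCR^{Rand}_{l} = P^{Rand}_{k} + \sum_{\beta=0}^{k-1} P^{Rand}_{\beta}\,\frac{\binom{L-k+\beta}{k}}{\binom{L}{k}},$$

where P^{2P}_β and P^{Rand}_β denote the probability that nodes l and i share exactly β keys under the specified key distribution 
scheme and B = (fL − k)/(L − k). 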

Proof: Without loss of generality, assume 1 < I < i + and let nodes I and i share exactly j3 keys. Let 

Z denote the set of remaining keys in node i, \Z\ — k — (3. Under the 2-Phase scheme, node j can first obtain keys 
from keyset Z through inheritance from its predecessor, node j — 1, and then from the random keypool of size L — k 
(obtained after removing the k keys of node j — 1) Let jmZ be a random variable denoting the number of keys from 
Z contained in node j — 1 and let PJ mZ = Pr.[jmZ = r]. Let PNCRi = 1 — PCR\ denote the probability that j 
contains at least one key from keyset Z, for diferent values of j3. Therefore, we have 

k-l k-(3 / r (k-r\ \ (k-r\ ( (L-2k+(3\ \ \ 

(3=0 r=0 \ I \fk) ) \fk) { \k-fk) ) / 

where the first term after the inner summation is the probability that at least one out of r keys is inherited by node 
j while the second term represents the complementary situation in which at least one key from keyset Z is obtained 
from the random keypool. 

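Next, using the fact that $\binom{k-r}{fk} / \binom{k}{fk} \le (1-f)^r$ and substituting in Equation (6), we have

$$\begin{aligned}
PNCR_l &\ge \sum_{\beta=0}^{k-1} P^{2P}_{li,\beta} \sum_{r=0}^{k-\beta} P^{jmZ}_r \left(1 - (1-f)^r \frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}}\right) \\
&= \sum_{\beta=0}^{k-1} P^{2P}_{li,\beta} \left(1 - \frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}} \sum_{r=0}^{k-\beta} (1-f)^r P^{jmZ}_r\right) \\
&\ge 1 - \left(P^{2P}_{li,k} + \sum_{\beta=0}^{k} P^{2P}_{li,\beta} \frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}} \sum_{r=0}^{k-\beta} (1-f)^r P^{jmZ}_r\right)
\end{aligned} \tag{7}$$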



Therefore we have

$$PCR^{2P}_l \le P^{2P}_{li,k} + \sum_{\beta=0}^{k} P^{2P}_{li,\beta} \frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}} \sum_{r=0}^{k-\beta} (1-f)^r P^{jmZ}_r \tag{8}$$

We obtain an efficient approximation for $PCR^{2P}_l$ as follows: Let $j - i = y$, $j - l = m$. Let $p = k/L + B^y(1 - k/L)$
if $l < i$ and $p = (1 - B^m)k/L$ if $i < l < i + \lceil y/2 \rceil$, where $B = (fL - k)/(L - k)$. From Lemma 2, the expected number
of keys from $Z$ present in node $j-1$ is $p(k - \beta)$. For reasonably small values of $k/L$, we can therefore approximate the
distribution of random variable $jmZ$ by the standard Binomial distribution $B(k - \beta, p)$ with the same mean $p(k - \beta)$.
Therefore, we get

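$$\begin{aligned}
PCR^{2P}_l &\le P^{2P}_{li,k} + \sum_{\beta=0}^{k} P^{2P}_{li,\beta} \frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}} \left( \sum_{r=0}^{k-\beta} \binom{k-\beta}{r} (1-f)^r p^r (1-p)^{k-\beta-r} \right) \\
&= P^{2P}_{li,k} + \sum_{\beta=0}^{k} P^{2P}_{li,\beta} \frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}} (1 - fp)^{k-\beta}
\end{aligned} \tag{9}$$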



where $p$ is defined as above.

By the corollary, $PCR^{2P}_{i-t} = PCR^{2P}_{j+t}$ for $t \ge 1$ and $PCR^{2P}_{i+t} = PCR^{2P}_{j-t}$ for $1 \le t \le \lceil (j-i)/2 \rceil$. This defines
$PCR^{2P}_l$ for all values of $l$ as specified in the statement of the proposition³.
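
The step leading to Equation (9) relies on the binomial theorem: once $jmZ$ is approximated by a Binomial distribution with parameters $k - \beta$ and $p$, the inner sum over $r$ collapses to $(1 - fp)^{k-\beta}$. The following Python sketch checks this collapse numerically. The function name and the parameter values are illustrative assumptions and do not come from the paper.

```python
from math import comb, isclose

def inner_sum(n: int, p: float, f: float) -> float:
    """Sum over r of (1 - f)^r * Pr[Binomial(n, p) = r], where n stands for k - beta."""
    return sum(
        comb(n, r) * (p ** r) * ((1 - p) ** (n - r)) * ((1 - f) ** r)
        for r in range(n + 1)
    )

# Assumed illustrative values: n = k - beta keys in Z, inheritance factor f.
n, p, f = 30, 0.02, 0.5
closed_form = (1 - f * p) ** n  # the factor that appears in Equation (9)
print(inner_sum(n, p, f), closed_form)
print(isclose(inner_sum(n, p, f), closed_form, rel_tol=1e-9))
```

The agreement holds for any $0 \le p, f \le 1$, since each term is $\binom{n}{r}((1-f)p)^r(1-p)^{n-r}$.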

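Finally, using the fact that the probability of node $j$ not containing any key from $Z$ is $\binom{L-k+\beta}{k} / \binom{L}{k}$ under random
key predistribution, we derive $PCR^{Rand}_l$ as:

$$PCR^{Rand}_l = P^{Rand}_{li,k} + \sum_{\beta=0}^{k} P^{Rand}_{li,\beta} \frac{\binom{L-k+\beta}{k}}{\binom{L}{k}} \tag{10}$$

■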
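
As a numerical illustration of Equation (10), the sketch below evaluates $PCR^{Rand}_l$ exactly, assuming that under random predistribution the number of keys shared by nodes $l$ and $i$ follows a hypergeometric distribution. It also computes the small-$k/L$ approximation $(1 - k/L + (k/L)^2)^k$ that is used later in the proof of Proposition 6. The function names and the parameter values are assumptions chosen for illustration.

```python
from math import comb

def p_share_random(L: int, k: int, beta: int) -> float:
    """Pr[two random k-subsets of an L-key pool share exactly beta keys]."""
    return comb(k, beta) * comb(L - k, k - beta) / comb(L, k)

def pcr_random(L: int, k: int) -> float:
    """Right-hand side of Equation (10), with the sum taken over beta = 0..k."""
    total = p_share_random(L, k, k)
    for beta in range(k + 1):
        # Probability that node j holds none of the k - beta keys in Z.
        miss = comb(L - k + beta, k) / comb(L, k)
        total += p_share_random(L, k, beta) * miss
    return total

def pcr_random_approx(L: int, k: int) -> float:
    """Binomial approximation of the same quantity for k much smaller than L."""
    x = k / L
    return p_share_random(L, k, k) + (1 - x + x * x) ** k

# Assumed small parameters so the exact binomial coefficients stay cheap.
print(pcr_random(1000, 10), pcr_random_approx(1000, 10))
```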

We now consider two separate but related issues. First, we determine values of $f$ which minimize the probability
of a given communication link $(i, j)$ being compromised under the 2-Phase scheme. Given the security-performance
tradeoffs, the user may desire a higher level of connectivity than provided by this optimal $f$. Therefore, as an alternate
performance metric, we determine values of $f$ for which this probability is lower under 2-Phase key predistribution as
opposed to Random key predistribution.

³ Henceforth, we will only use $l \le \lceil (j-i)/2 \rceil$ for the remaining propositions, using this symmetry.

Proposition 5: $PCR_l$, the probability of a given link $(i, j)$ in the sensornet being compromised by capture of any
node $l$, is minimized by choosing an inheritance factor $f$ that maximizes the expression

$$x(1 - B^t)\left(1 - x(2f - f^2)\right) + f(1 - B^t)B^y(1 + fx - 2x)(1 - x) \quad \text{for } l = i - t \text{ and } l = j + t,\ t \ge 1,$$

$$x(1 - B^t)\left[1 - x(2f - f^2) - fB^{y-t}(1 + fx - 2x)\right] \quad \text{for } l = i + t \text{ and } l = j - t,\ 1 \le t \le \left\lceil \frac{j-i}{2} \right\rceil,$$

where $x = k/L$, $j = i + y$ and $B = \frac{fL-k}{L-k}$. If communicating nodes $i$ and $j$ are separated by a minimum distance (i.e.
$y > c$, where $c$ is a small constant), then $f = x$ minimizes this link vulnerability.

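Proof: We first obtain a simple approximation for Equation (9). From an earlier proposition, the expected number of keys
in common between any two nodes $i$ and $l$ is given by $k\,p_{il}$, where $p_{il} = x + B^{|i-l|}(1 - x)$. Therefore for $k \ll L$, we can
approximate the distribution of the number of common keys $\beta$ between $i$ and $l$ by the Binomial $B(k, p_{il})$. Now using

$$\frac{\binom{L-2k+\beta}{k-fk}}{\binom{L-k}{k-fk}} \approx \left(1 - \frac{k-fk}{L-k}\right)^{k-\beta} = \left(\frac{1 - 2x + fx}{1 - x}\right)^{k-\beta},$$

we can rewrite Equation (9) for $l \le \lceil (j-i)/2 \rceil$ as

$$PCR^{2P}_l \le P^{2P}_{li,k} + \begin{cases}
\sum_{\beta=0}^{k} \binom{k}{\beta} p_{il}^{\beta} \left( (1 - p_{il}) \dfrac{1 - 2x + fx}{1 - x} \left(1 - fx - f(1-x)B^{y}\right) \right)^{k-\beta} & \text{if } l < i \\[2ex]
\sum_{\beta=0}^{k} \binom{k}{\beta} p_{il}^{\beta} \left( (1 - p_{il}) \dfrac{1 - 2x + fx}{1 - x} \left(1 - fx + fxB^{j-l}\right) \right)^{k-\beta} & \text{if } i < l < i + \lceil y/2 \rceil
\end{cases}$$

Substituting for $p_{il}$ and further simplifying, we get

$$PCR^{2P}_l \le P^{2P}_{li,k} + \begin{cases}
\left(1 - \left[x(1 - B^t)\left(1 - x(2f - f^2)\right) + f(1 - B^t)B^y(1 + fx - 2x)(1 - x)\right]\right)^k & \text{for } l = i - t \text{ and } l = j + t,\ t \ge 1 \\[2ex]
\left(1 - x(1 - B^t)\left[1 - x(2f - f^2) - fB^{y-t}(1 + fx - 2x)\right]\right)^k & \text{for } l = i + t \text{ and } l = j - t,\ 1 \le t \le \lceil y/2 \rceil
\end{cases} \tag{11}$$

$PCR^{2P}_l$ is minimized by maximizing the inner term as stated in the proposition. When $j - i > 4$ and $k \ll L$, this
minimum value is obtained at $f = 1/k$. ■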
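
The simplification from the binomial-sum form of the bound to the closed form of Equation (11) can be checked numerically. The sketch below is a sanity check under stated assumptions: it takes $B = (f - x)/(1 - x)$ (the definition $B = (fL-k)/(L-k)$ rewritten with $x = k/L$), reads $p_{il}$ as the per-key sharing probability $x + B^t(1-x)$, and uses the exponent $y - t$ in the second case. The function names and test values are illustrative and not part of the paper.

```python
from math import isclose

def bound_binomial_form(x, f, t, y, k, captured_outside):
    """(p_il + (1 - p_il) * a * b)^k, i.e. the binomial sum over beta in closed form."""
    B = (f - x) / (1 - x)
    p_il = x + B ** t * (1 - x)
    a = (1 - 2 * x + f * x) / (1 - x)
    if captured_outside:                      # l = i - t (captured node precedes i)
        b = 1 - f * x - f * (1 - x) * B ** y
    else:                                     # l = i + t (captured node lies between i and j)
        b = 1 - f * x + f * x * B ** (y - t)
    return (p_il + (1 - p_il) * a * b) ** k

def bound_simplified_form(x, f, t, y, k, captured_outside):
    """(1 - inner term)^k, with the inner term as written in Proposition 5."""
    B = (f - x) / (1 - x)
    if captured_outside:
        inner = (x * (1 - B ** t) * (1 - x * (2 * f - f * f))
                 + f * (1 - B ** t) * B ** y * (1 + f * x - 2 * x) * (1 - x))
    else:
        inner = x * (1 - B ** t) * (1 - x * (2 * f - f * f)
                                    - f * B ** (y - t) * (1 + f * x - 2 * x))
    return (1 - inner) ** k

# Assumed illustrative values: x = k/L = 0.01, f = 0.05, t = 2, y = 6, k = 100.
for outside in (True, False):
    lhs = bound_binomial_form(0.01, 0.05, 2, 6, 100, outside)
    rhs = bound_simplified_form(0.01, 0.05, 2, 6, 100, outside)
    print(outside, lhs, rhs, isclose(lhs, rhs, rel_tol=1e-9))
```

Scanning `bound_simplified_form` over a grid of $f$ values for fixed $x$, $t$ and $y$ is one way to explore which inheritance factor gives the smallest bound.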

Proposition 6: The probability of a given link being compromised by the capture of an arbitrary node $l \neq i, j$ in
the sensornet is lower under the 2-Phase scheme as compared to Random key predistribution for

$$\frac{1}{k} \le f \le x\,\frac{2 - x}{1 + 2x}$$

where $x = k/L$.

Proof: We can express $PCR^{Rand}_l$ in Equation (10) as

$$PCR^{Rand}_l \approx P^{Rand}_{li,k} + \left(1 - \frac{k}{L} + \left(\frac{k}{L}\right)^2\right)^k \tag{12}$$

by using the approximation $\binom{L-k+\beta}{k} / \binom{L}{k} \approx \left(1 - \frac{k}{L}\right)^{k-\beta}$ and also approximating the hypergeometric distribution of the
number of common keys between nodes $l$ and $i$ by the Binomial $B(k, \frac{k}{L})$ (assuming $k \ll L$).

Now comparing Equations (12) and (11), using $|i - l| = t$ and assuming $j - i > 4$, we have

$$PCR^{2P}_l < PCR^{Rand}_l \iff x(1 - B^t)\left[1 - x(2f - f^2)\right] > x(1 - x)$$

$$\implies \left[1 - (1 - B^t)(2f - f^2)\right]x > B^t \tag{13}$$

From the above expression, it can be seen that for reasonably large values of $t$ (i.e. the captured node's LID is not
too close to $i$ or $j$), $PCR^{2P} < PCR^{Rand}$ for larger values of $f < 1$. However, the worst case for link compromise
$PCR^{2P}$ occurs when $t = 1$, i.e. when nodes $l = i + 1$ or $l = j - 1$ are captured. Therefore substituting $t = 1$ above
and simplifying, we get $f < x(2 - x - 2f + 3f^2 - f^3)$, which implies $\frac{1}{k} \le f \le x\,\frac{2-x}{1+2x}$. Hence the 2-Phase scheme
has lower vulnerability than the Random scheme for $f$ up to $2k/L$. ■
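
To make the range in Proposition 6 concrete, the following sketch evaluates the condition $[1 - (1 - B^t)(2f - f^2)]x > B^t$ and the closed-form bounds $1/k$ and $x(2-x)/(1+2x)$. The function names and the values $k = 100$, $L = 10000$ are assumptions chosen for illustration.

```python
def two_phase_beats_random(f: float, k: int, L: int, t: int) -> bool:
    """Evaluate [1 - (1 - B^t)(2f - f^2)] x > B^t for a captured node at LID distance t."""
    x = k / L
    B = (f * L - k) / (L - k)
    return (1 - (1 - B ** t) * (2 * f - f * f)) * x > B ** t

def inheritance_factor_range(k: int, L: int) -> tuple[float, float]:
    """Lower and upper limits on f stated in Proposition 6."""
    x = k / L
    return 1 / k, x * (2 - x) / (1 + 2 * x)

low, high = inheritance_factor_range(100, 10000)
print(low, high, 2 * 100 / 10000)                    # upper limit is just under 2k/L
print(two_phase_beats_random(0.015, 100, 10000, 1))  # f inside the range, worst case t = 1
print(two_phase_beats_random(0.1, 100, 10000, 1))    # f too large for t = 1
print(two_phase_beats_random(0.1, 100, 10000, 3))    # same f, captured node farther away
```

With these assumed values the condition fails for $f = 0.1$ at $t = 1$ but holds at $t = 3$, which matches the observation that larger inheritance factors are tolerable when the captured node's LID is not adjacent to the communicating pair.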

From Equation (13), we can see that if the captured node is not too close to the communicating nodes (in terms
of LID), then the 2-Phase scheme outperforms Random key predistribution for larger values of $f$, which in turn
ensures higher connectivity. In particular, if the adversary is restricted to using knowledge of a captured node's
keys within a small neighborhood such as its cluster, then we can further minimize link vulnerability to single-node
capture by considering a modified 2-Phase scheme in which two neighboring nodes $i$ and $j$ with at least $q$ keys in common
communicate only if there does not exist any other node $l$ in the cluster such that $i < l < j$. Consider a sensor network
with average node density $M$. Then the expected LID difference between node $i$ and the nearest node (other than node
$j$) is $N/M$. We can therefore approximate link invulnerability to single node capture as follows:

Proposition 7: In a sensor network with average node density M, the probability that a given communication link 
(i, j) is invulnerable to single-node capture within its cluster is 1 — (1 — PCR 21 ^^ ) . 

1 M 

Simulation results in the next section illustrate link vulnerability within a cluster for different sensornet parameters.
Finally, for an average adversary with no specific knowledge about the network topology, the probability of capturing
any node $l \neq i, j$ is $1/(N - 2)$. The following proposition then follows directly from Proposition 6.

Proposition 8: The vulnerability metric $VC$ of a given communication link in an $N$-node sensor network with
parameters $k$ and $L$ is lower if keys are predistributed using the 2-Phase scheme as compared to random key
predistribution, for $\frac{1}{k} \le f \le x\,\frac{2-x}{1+2x}$.

VII. Simulation Results 

In this section, we describe some security and performance results based on simulations carried out on a 1000-node
sensor network using a key pool size $L$ ranging from 8000 to 10000 keys. The per-node key space $k$ varies from 40 to 150 keys.
We have evaluated the 2-Phase key distribution scheme for $f = 0.5$. Nodes in the simulations are deployed in clusters
as in LEACH [7], where the average node density (in a cluster) varies between 20 and 50 nodes. Figure 1 describes some
$q$-composite network connectivity metrics while Figures 2-4 describe several sensornet security metrics.



Fig. 1. Average $q$-composite degree of a node: (a) 2-composite, (b) 3-composite.



Figure 1 describes the average $q$-composite degree of a node for different values of $q$. As can be seen clearly, the
average degree is higher under 2-Phase, and its advantage over random key pre-distribution grows as $q$ increases.
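
For readers who want to reproduce the random-scheme baseline of this metric, the sketch below estimates the average $q$-composite degree by simulation. It is a simplified illustration under stated assumptions: every node draws $k$ keys uniformly from the pool, all nodes are treated as mutual neighbors (no radio range or clustering), and the 2-Phase assignment is not reproduced. The function names, the seed and the small parameter values are hypothetical.

```python
import random

def random_key_rings(num_nodes: int, pool_size: int, ring_size: int, rng: random.Random):
    """Assign each node ring_size distinct keys drawn uniformly from the pool."""
    return [frozenset(rng.sample(range(pool_size), ring_size)) for _ in range(num_nodes)]

def average_q_composite_degree(rings, q: int) -> float:
    """Average number of other nodes with which a node shares at least q keys."""
    n = len(rings)
    links = 0
    for a in range(n):
        for b in range(a + 1, n):
            if len(rings[a] & rings[b]) >= q:
                links += 1
    return 2 * links / n  # each link contributes to the degree of two nodes

rng = random.Random(1)  # fixed seed so runs are repeatable
rings = random_key_rings(num_nodes=60, pool_size=500, ring_size=30, rng=rng)
for q in (1, 2, 3):
    print(q, average_q_composite_degree(rings, q))
```

By construction the estimated degree cannot increase with $q$, since every pair sharing at least $q + 1$ keys also shares at least $q$.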

Figures 2-4 describe several sensornet security metrics. Figure 2(a) illustrates a measure of communication security
(i.e., invulnerability) by describing the average number of exclusive keys per pair of nodes in a cluster. This number
is higher under 2-Phase than under the random scheme. Figure 2(b) measures the probability that a pair
of nodes possesses at least one exclusive key under the 2-Phase key pre-distribution scheme. This probability rises
sharply as $k$, the number of keys possessed by each node, increases.
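
The exclusive-key count plotted in Figure 2(a) can be computed directly from the key rings. The sketch below assumes that an exclusive key for a pair is a key held by both nodes and by no other node in the cluster; the function name and the toy key rings are hypothetical.

```python
def exclusive_keys(rings: dict, i, j) -> set:
    """Keys shared by nodes i and j that no other node in the cluster holds."""
    shared = rings[i] & rings[j]
    held_by_others = set()
    for node, ring in rings.items():
        if node not in (i, j):
            held_by_others |= ring
    return shared - held_by_others

# Toy cluster: key 3 is shared by a and b but also held by c, so only key 2 is exclusive.
cluster = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {3, 5}}
print(exclusive_keys(cluster, "a", "b"))
```

Capturing node c in this toy cluster exposes key 3 but leaves key 2, so the link between a and b survives as long as it uses the exclusive key.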

Figures 3 and 4 measure the vulnerability of communication links in a cluster under single as well as multiple node 
capture scenarios. As can be seen, the average number of links exposed to the adversary is lower under the 2-Phase 
scheme. The simulation results verify the analytical observations in Propositions 4 and 5 regarding link vulnerability. 
Lower link vulnerability for 2-Phase is explained by the fact that it is highly unlikely for captured nodes to have an 
LID adjacent to the LIDs of the communicating nodes. 

VIII. Implementation Issue: Creating Sorted Shared Key Lists 
The security of a communication link strengthens with the exclusivity of the key(s) used for encryption on this link.
For mutual communication, each pair of nodes must therefore use keys shared among the least number of nodes. During
the shared key discovery phase, each node discovers its logical neighbors, i.e., the neighbors with whom it shares at
least one key. We propose the following metric to evaluate each shared key from this point of view.

Fig. 2. (a) Average number of exclusive keys per node pair. (b) Probability that a node pair has an exclusive key.

Fig. 3. Average number of links compromised in a cluster when (a) one and (b) three nodes are captured.

Fig. 4. Average number of links compromised in a cluster when five nodes are captured.

Let $k$ be a key shared between any two nodes $i$ and $j$ and let $SV_{ij}(k)$ denote the set of nodes in the neighborhood of $i$
and $j$ which share key $k$. Therefore, the eligibility $E_{ij}(k)$ of this key $k$ with respect to the pair of nodes $i$ and $j$ is defined as:


The higher the value of $E_{ij}(k)$, the better the key $k$ is for communication between $i$ and $j$. During the shared key
discovery phase, each node broadcasts the list of identifiers of the keys it possesses. Each node then creates a separate
list of shared keys for each of its neighbors, sorted according to their eligibility values. The most eligible key should
be used for communication until it is revoked.
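
The list-building step can be sketched as follows. The eligibility formula itself is not reproduced here, so this sketch uses a stand-in ranking: a shared key is treated as more eligible when fewer other nodes in the neighborhood hold it, following the requirement that keys be shared among the least number of nodes. The function name, the tie-break on key identifier and the toy data are assumptions.

```python
def sorted_shared_keys(rings: dict, i, j, neighborhood) -> list:
    """Shared keys of i and j, most eligible first (fewest other holders nearby)."""
    shared = rings[i] & rings[j]

    def other_holders(key) -> int:
        # Count neighborhood nodes, other than i and j, whose broadcast list contains the key.
        return sum(1 for n in neighborhood if n not in (i, j) and key in rings[n])

    # Ties are broken by key identifier so that both endpoints derive the same order.
    return sorted(shared, key=lambda key: (other_holders(key), key))

# Toy data: key identifiers as broadcast during shared key discovery.
rings = {"i": {1, 2, 3, 4}, "j": {2, 3, 4, 9}, "u": {2, 3}, "v": {3}}
print(sorted_shared_keys(rings, "i", "j", neighborhood=["u", "v"]))
```

In this toy example key 4 is held by no other node, key 2 by one and key 3 by two, so the pair would use key 4 first and fall back to the next entry if it is revoked.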



IX. Conclusion 

Efficient pre-distribution of keys to sensor nodes is a very important issue for secure communication in sensor
networks. Connectivity and resiliency to enemy attacks must be traded off very carefully. In this paper, we present
an analytical framework with several quantitative metrics for evaluating key predistribution schemes and determining
their security-performance tradeoff. We also present a 2-Phase key predistribution scheme, based on a combination
of inheritance and randomness, which is proved to offer better tradeoffs than random key predistribution.